ADHOC: a Tool for Performing Effective Feature Selection
نویسندگان
چکیده
This paper introduces ADHOC, a tool that integrates statistical methods and machine learning techniques to perform effective feature selection. Feature selection plays a central role in the data analysis process since redundant and irrelevant features often degrade the performance of induction algorithms, both in speed and predictive accuracy. ADHOC combines the advantages of both filter and feedback approaches to feature selection to enhance the understanding of the given data and increase the efficiency of the feature selection process. We report results of extensive experiments on realworld data which demonstrate the effectiveness of ADHOC as data reduction technique as well as feature selection method. ADHOC has been employed in the analysis of several corporate databases. In particular, it is currently used to support the difficult task of early estimating the cost of software projects.
منابع مشابه
Performing Effective Feature Selection by Investigating the Deep Structure of the Data
This paper introduces ADHOC (Automatic Discoverer of Higher-Order Correlation), an algorithm that combines the advantages of both filter and feedback models to enhance the understanding of the given data and to increase the efficiency of the feature selection process. ADHOC partitions the observed features into a number of groups, called factors, that reflect the major dimensions of the phenome...
متن کاملMental Arithmetic Task Recognition Using Effective Connectivity and Hierarchical Feature Selection From EEG Signals
Introduction: Mental arithmetic analysis based on Electroencephalogram (EEG) signal for monitoring the state of the user’s brain functioning can be helpful for understanding some psychological disorders such as attention deficit hyperactivity disorder, autism spectrum disorder, or dyscalculia where the difficulty in learning or understanding the arithmetic exists. Most mental arithmetic recogni...
متن کاملOptimal Feature Selection for Data Classification and Clustering: Techniques and Guidelines
In this paper, principles and existing feature selection methods for classifying and clustering data be introduced. To that end, categorizing frameworks for finding selected subsets, namely, search-based and non-search based procedures as well as evaluation criteria and data mining tasks are discussed. In the following, a platform is developed as an intermediate step toward developing an intell...
متن کاملA Random Forest Classifier based on Genetic Algorithm for Cardiovascular Diseases Diagnosis (RESEARCH NOTE)
Machine learning-based classification techniques provide support for the decision making process in the field of healthcare, especially in disease diagnosis, prognosis and screening. Healthcare datasets are voluminous in nature and their high dimensionality problem comprises in terms of slower learning rate and higher computational cost. Feature selection is expected to deal with the high dimen...
متن کاملOptimal Feature Selection for Data Classification and Clustering: Techniques and Guidelines
In this paper, principles and existing feature selection methods for classifying and clustering data be introduced. To that end, categorizing frameworks for finding selected subsets, namely, search-based and non-search based procedures as well as evaluation criteria and data mining tasks are discussed. In the following, a platform is developed as an intermediate step toward developing an intell...
متن کامل